Multi-Phase Redistribution: A Communication-Efficient Approach to Array Redistribution

Authors

  • C.-H. Huang
  • J. Ramanujam
  • P. Sadayappan
Abstract

Distributed-memory implementations of several scientific applications require array redistribution. Array redistribution is used in languages such as High Performance Fortran to dynamically change the distribution of arrays across processors. Performing array redistribution incurs two overheads: an indexing overhead for determining the set of processors to communicate with and the array elements to be communicated, and a communication overhead for performing the necessary irregular all-to-many personalized communication. In this paper, efficient runtime methods for performing array redistribution are presented. To reduce the indexing overhead, precise closed forms for enumerating the processors to communicate with and the array elements to be communicated are developed for two special cases of array redistribution involving block-cyclically distributed arrays. The general array redistribution problem for block-cyclically distributed arrays can be expressed in terms of these special cases. Using the developed closed forms, a distributed algorithm for scheduling the irregular communication for redistribution is developed. The generated schedule eliminates node contention and incurs the least communication overhead. The scheduling algorithm has an asymptotically lower scheduling overhead than techniques presented in the literature. Based on the developed closed forms, a cost model for estimating the communication and the indexing overhead for array redistribution is developed. Using this model, a multi-phase approach for reducing the communication cost for array redistribution is presented. The key idea is to perform the redistribution as a sequence of redistributions such that the total cost of the sequence is less than that of direct redistribution. Algorithms for determining the sequence of intermediate data distributions that minimizes the total redistribution time are developed.
Extensions of the developed closed forms and algorithms to perform array redistribution of multi-dimensional arrays are presented. Experimental results on the Cray T3D and IBM-SP2 demonstrate the validity of the developed cost models and the efficacy of the multi-phase approach for array redistribution.
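
As a rough illustration of the indexing side of the problem, the sketch below enumerates by brute force which processor pairs exchange data in a cyclic(s) to cyclic(t) redistribution of a one-dimensional array. It is not the closed-form method of the paper; the function names (`owner`, `comm_matrix`, `message_count`) are hypothetical, and the ownership rule assumed is the standard block-cyclic one, where global index i belongs to processor (i div b) mod P.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor owning global index i under a cyclic(block) distribution.

    Assumes the standard block-cyclic rule: (i // block) % nprocs.
    """
    return (i // block) % nprocs


def comm_matrix(n, s, t, nprocs):
    """Brute-force count of elements sent from each source to each destination.

    Covers a cyclic(s) -> cyclic(t) redistribution of an n-element array.
    Elements that stay on the same processor are not counted.
    """
    counts = defaultdict(int)
    for i in range(n):
        src = owner(i, s, nprocs)
        dst = owner(i, t, nprocs)
        if src != dst:
            counts[(src, dst)] += 1
    return dict(counts)


def message_count(n, s, t, nprocs):
    """Number of distinct (source, destination) pairs, i.e. messages needed."""
    return len(comm_matrix(n, s, t, nprocs))


if __name__ == "__main__":
    n, nprocs = 16, 4
    # Direct redistribution versus the two phases of a hypothetical
    # cyclic(4) -> cyclic(2) -> cyclic(1) sequence.
    for s, t in [(4, 1), (4, 2), (2, 1)]:
        print(f"cyclic({s}) -> cyclic({t}):", message_count(n, s, t, nprocs), "messages")
```

Comparing the message counts and sizes of a direct redistribution with those of each phase in a candidate sequence is one simple way to explore the trade-off the abstract describes; a realistic comparison would weight them with a cost model that includes per-message startup time, which this sketch leaves out.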


Similar resources

Multi-phase array redistribution: modeling and evaluation

Table 1: Execution times (ms) for cyclic(s) to cyclic(t) redistribution on 32 processors (columns: s, t, lcm, lcm*2, lcm*4, gcd, gcd/2, gcd/4). ... other block sizes t. Fig. 3 shows the total times in milliseconds for a cyclic(192) to cyclic(8) redistribution on 32 processors for increasing data sizes. This redistribution corresponds to the cyclic(Y t) to cyclic(t) case with Y = 2...


An Optimal Processor Replacement Scheme for Efficient Communication of Runtime Data Redistribution

Dynamic data distribution is used to enhance data locality and algorithm performance by reducing inter-processor communication in data parallel programs on distributed memory multi-computers. Since the exchange of data is performed at run-time, there is a performance tradeoff between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of ex...


Contention-Free Communication Scheduling for Array Redistribution

Array redistribution is often required in programs on distributed memory parallel computers. It is essential to use efficient algorithms for redistribution; otherwise the performance of the programs may degrade considerably. The redistribution overheads consist of two parts: index computation and interprocessor communication. If there is no communication scheduling in a redistribution algorithm, ...


Irregular Redistribution Scheduling by Partitioning Messages

Dynamic data redistribution enhances data locality and improves algorithm performance for numerous scientific problems on distributed memory multi-computer systems. Regular data distribution typically employs BLOCK, CYCLIC, or BLOCK-CYCLIC(c) to specify array decomposition. Conversely, an irregular distribution specifies an uneven array distribution based on user-defined functions. Performing ...


Digitally Excited Reconfigurable Linear Antenna Array Using Swarm Optimization Algorithms

This paper describes the synthesis of digitally excited pencil/flat top dual beams simultaneously in a linear antenna array constructed of isotropic elements. The objective is to generate a pencil/flat top beam pair using the excitations generated by the evolutionary algorithms. Both the beams share common variable discrete amplitude excitations and differ in variable discrete phase excitations...



Publication date: 1994